Return-Path: <icon-group-sender>
Received: (from root@localhost)
by baskerville.CS.Arizona.EDU (8.11.1/8.11.1) id e9RFMNB20153
for icon-group-addresses; Fri, 27 Oct 2000 08:22:23 -0700 (MST)
Message-Id: <200010271522.e9RFMNB20153@baskerville.CS.Arizona.EDU>
X-Sender: whm@mail.mse.com
Date: Fri, 27 Oct 2000 01:44:04 -0700
To: icon-group@cs.arizona.edu
From: "William H. Mitchell" <whm@mse.com>
Subject: Re: Yet another Newbie question....
Errors-To: icon-group-errors@cs.arizona.edu
Status: RO
Content-Length: 2366
This approach comes up from time to time on this list, but for the
scanning-challenged a lot of text parsing can be done with a procedure that
breaks strings apart at delimiters. Lots of folks have written such a
thing. The one I wrote is called split, perhaps after the p*rl analog -- I
forget. A clipping about split from some lecture notes I wrote on Icon
follows below. Note that each line beginning with "][", together with the
line after it, is an interaction with an Icon expression evaluator.
-----------
The procedure split(s, delims) returns a list consisting of the portions of
the string s delimited by characters in delims:
][ split("just a test here ", ' ');
r := L1:["just","a","test","here"] (list)
][ split("...1..3..45,78,,9 10 ", '., ');
r := L1:["1","3","45","78","9","10"] (list)
Consider a file whose lines consist of zero or more integers separated by
white space:
5 10 0
100 50
200
1 2 3 4 5 6 7 8 9 10
A program to sum the numbers in such a file:
link split
procedure main()
    sum := 0
    while line := read() do {
        nums := split(line, ' \t')
        every num := !nums do
            sum +:= num
        }
    write("The sum is ", sum)
end
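For comparison, here is a minimal sketch of the same sum written with string scanning instead of split. It is an illustration rather than part of the lecture notes, and it assumes the numbers are unsigned decimal integers, as in the sample file.

```icon
procedure main()
    local sum, line

    sum := 0
    while line := read() do
        # Scan the line: skip ahead to the next digit, then take the
        # whole run of digits and add it (strings convert to integers).
        line ? while tab(upto(&digits)) do
            sum +:= tab(many(&digits))
    write("The sum is ", sum)
end
```

Given the four sample lines above on standard input, both versions should write "The sum is 420". The split version is arguably easier to read; the scanning version needs no linked procedure.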
If split has a third argument that is non-null, both delimited and
delimiting pieces of the string are produced:
][ split("520-621-6613", '-', 1);
r := L1:["520","-","621","-","6613"] (list)
A sequence of splits:
s := "a.b x.y p.q";
L1 := split(s, ' ');
L2 := split(L1[2], '.');
----------
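The sequence of splits that closes the clipping can be checked with a small program. This is a sketch that assumes the source for split has been saved as split.icn and compiled with "icont -c split.icn" so that "link split" can find it.

```icon
link split    # assumption: split.icn was compiled to ucode beforehand

procedure main()
    local s, L1, L2

    s := "a.b x.y p.q"
    L1 := split(s, ' ')        # ["a.b", "x.y", "p.q"]
    L2 := split(L1[2], '.')    # ["x", "y"]
    every write(!L1)           # writes a.b, x.y, p.q, one per line
    write("--")
    every write(!L2)           # writes x, y, one per line
end
```

The first split breaks the string at blanks; the second takes one of the resulting pieces and breaks it again at periods.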
Here's the source for split:
#
#  split(s, dlms, keepall) splits the string s into consecutive substrings
#  that do/do not consist of characters in the cset dlms, producing a
#  list of strings.  If keepall is null, strings consisting of delimiters
#  are not included in the result list.
#
#  If not specified, dlms defaults to blank and tab, which essentially
#  "tokenizes" non-whitespace:
#
#      words := split(read())
#
#  Author: William H. Mitchell (whm@mse.com) c. 1996
#
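procedure split(s, dlms, keepall)
    local w, ws, addproc, otherproc

    ws := []
    /dlms := ' \t'
    addproc := put
    if \keepall then
        otherproc := put
    else
        otherproc := 1      # 1(ws, w) just produces ws, so nothing is added
    # If s starts with a delimiter, begin with the two roles swapped.
    if dlms := (any(dlms, s[1]) & ~dlms) then
        otherproc :=: addproc
    # Alternate between runs of non-delimiters and runs of delimiters.
    s ? while w := tab(many(dlms := ~dlms)) do {
        addproc(ws, w)
        otherproc :=: addproc
        }
    return ws
end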
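The least obvious part of split is the assignment of the integer 1 to otherproc. In Icon, invoking an integer i with an argument list produces the i-th argument, so calling 1(ws, w) produces ws and leaves the list untouched. A minimal sketch of that behavior on its own, with illustrative variable names:

```icon
procedure main()
    local ws, f

    ws := []
    f := put
    f(ws, "kept")       # put appends "kept" to ws
    f := 1
    f(ws, "dropped")    # 1(ws, "dropped") produces ws; nothing is appended
    write(*ws)          # writes 1
end
```

Swapping addproc and otherproc after each piece is what lets a single loop either keep or discard the delimiter runs without testing keepall on every iteration.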